WHAT IS CLAIMED IS:

1. A method for estimating naturalness of synthesized 
speech, wherein naturalness is a subjective quality of 
synthesized speech, the method comprising: 

generating a set of synthesized utterances; 
subjectively rating each of the synthesized 
utterances; 

calculating a score for each of the synthesized 
utterances using an objective measure, the 
objective measure being a function of 
textual information derived from the 
utterances; 

ascertaining a relationship between the scores of 
the objective measure and subjective ratings 
of the synthesized utterances; and 

using the relationship to estimate naturalness of 
synthesized speech. 
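
Illustrative sketch (not part of the claims): the steps of claim 1 can be pictured as fitting a relationship between objective scores and subjective ratings, then applying it to a new score. The claim does not fix the form of the relationship; the least-squares line below is only one assumption, and the function names and numbers are hypothetical placeholders.

```python
from statistics import mean


def fit_linear_relationship(scores, ratings):
    """Fit rating ~ slope * score + intercept by ordinary least squares.

    A straight line is an assumption made for illustration; the claim only
    requires that some relationship be ascertained.
    """
    if len(scores) != len(ratings) or len(scores) < 2:
        raise ValueError("need at least two paired observations")
    s_bar, r_bar = mean(scores), mean(ratings)
    variance = sum((s - s_bar) ** 2 for s in scores)
    if variance == 0:
        raise ValueError("objective scores must not all be equal")
    covariance = sum((s - s_bar) * (r - r_bar) for s, r in zip(scores, ratings))
    slope = covariance / variance
    return slope, r_bar - slope * s_bar


def estimate_naturalness(score, relationship):
    """Apply the fitted relationship to an objective score."""
    slope, intercept = relationship
    return slope * score + intercept


# Made-up placeholder values, one pair per synthesized utterance.
objective_scores = [0.0, 1.0, 2.0, 3.0]
subjective_ratings = [5.0, 4.0, 3.0, 2.0]

relationship = fit_linear_relationship(objective_scores, subjective_ratings)
predicted = estimate_naturalness(1.5, relationship)
```

Once fitted, the relationship lets a new utterance be assessed from its objective score alone, without a further listening test.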

2. The method of claim 1 wherein the objective 
measure comprises an indication of a position of a 
speech unit in a phrase. 

3. The method of claim 1 wherein the objective 
measure comprises an indication of a position of a 
speech unit in a word. 

4. The method of claim 1 wherein the objective 
measure comprises an indication of a category for a 
phoneme preceding a speech unit.

5. The method of claim 1 wherein the objective
measure comprises an indication of a category for a 
phoneme following a speech unit. 

6. The method of claim 1 wherein the objective 
measure comprises an indication of a category for the 
tone of a preceding speech unit. 

7. The method of claim 1 wherein the objective 
measure comprises an indication of a category for the 
tone of a following speech unit. 

8. The method of claim 1 wherein the objective 
measure comprises an indication of a prosodic mismatch 
between successive speech units. 

9. The method of claim 1 wherein the objective 
measure comprises an indication of level of stress of a 
speech unit. 

10. The method of claim 1 wherein the objective 
measure score for each synthesized utterance is a 
function of a length of said each synthesized 
utterance. 

11. The method of claim 10 wherein the length 
comprises a number of speech units in an utterance. 
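
Illustrative sketch (not part of the claims): claims 10 and 11 state only that the score is a function of utterance length, measured as a number of speech units. One possible reading, assumed here, is to average a per-unit cost over the unit count; the function name and the notion of a per-unit cost are hypothetical.

```python
def length_normalized_score(unit_costs):
    """Average per-unit cost over the number of speech units.

    Dividing by the unit count is an assumed form of length dependence,
    chosen so that long and short utterances are comparable.
    """
    if not unit_costs:
        raise ValueError("an utterance must contain at least one speech unit")
    return sum(unit_costs) / len(unit_costs)


example_score = length_normalized_score([1.0, 2.0, 3.0])
```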

12. The method of claim 1 wherein calculating a score
includes generating context vectors for each
synthesized utterance wherein the context vectors
comprise at least two coordinates of textual
information from a set including:

an indication of a position of a speech unit in a
phrase;

an indication of a position of a speech unit in a
word;

an indication of a category for a phoneme
preceding a speech unit;

an indication of a category for a phoneme
following a speech unit;

an indication of a category for the tone of a
preceding speech unit;

an indication of a category for the tone of a
following speech unit;

an indication of a level of stress of a speech
unit; and

an indication of a degree of coupling with a
neighboring speech unit.
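
Illustrative sketch (not part of the claims): the context vector of claim 12 can be pictured as a record with one coordinate per item of textual information. The integer encoding, the field names, and the mismatch count used for comparison are all assumptions for illustration; the claim requires only two or more of these coordinates.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class ContextVector:
    """One coordinate per listed item; integer category codes are assumed."""
    position_in_phrase: int
    position_in_word: int
    preceding_phoneme_category: int
    following_phoneme_category: int
    preceding_tone_category: int
    following_tone_category: int
    stress_level: int
    coupling_degree: int


def count_mismatches(target, candidate):
    """Count coordinates on which two context vectors differ."""
    return sum(a != b for a, b in zip(astuple(target), astuple(candidate)))


# Hypothetical codes: the vectors differ in the following phoneme
# category and the following tone category.
target = ContextVector(0, 1, 2, 3, 1, 1, 0, 2)
candidate = ContextVector(0, 1, 2, 4, 1, 2, 0, 2)
mismatches = count_mismatches(target, candidate)
```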

13. The method of claim 12 wherein calculating a score 
includes generating context vectors for each of the 
synthesized utterances wherein the context vectors 
comprise at least three coordinates of textual
information from the set. 

14. The method of claim 12 wherein calculating a score 
includes generating context vectors for each of the 
synthesized utterances wherein the context vectors 
comprise at least four coordinates of textual
information from the set.

15. The method of claim 12 wherein calculating a score
includes generating context vectors for each of the 
synthesized utterances wherein the context vectors 
comprise at least five coordinates of textual
information from the set. 

16. The method of claim 12 wherein calculating a score 
includes generating context vectors for each of the 
synthesized utterances wherein the context vectors 
comprise at least six coordinates of textual
information from the set. 

17. The method of claim 12 wherein the objective 
measure includes an indication of prosodic mismatch of 
successive speech units. 

18. The method of claim 12 wherein the coordinates are 
weighted. 
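
Illustrative sketch (not part of the claims): claim 18 states only that the coordinates are weighted. One assumed way to apply weights is to sum the weight of every coordinate on which two context vectors disagree; the function name and the weight values are hypothetical.

```python
def weighted_context_distance(target, candidate, weights):
    """Sum the weights of the coordinates that differ between two vectors.

    The weighted mismatch sum is an assumed form; any weighting scheme
    over the coordinates would fit the wording of the claim.
    """
    if not (len(target) == len(candidate) == len(weights)):
        raise ValueError("vectors and weights must have the same length")
    return sum(w for t, c, w in zip(target, candidate, weights) if t != c)


# Three coordinates with placeholder weights; the last two differ.
distance = weighted_context_distance((0, 1, 2), (0, 2, 3), (0.5, 1.0, 2.0))
```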

19. A method for developing a speech synthesizer, the 
method comprising: 

obtaining a set of synthesized utterances from the 
speech synthesizer; 

subjectively rating naturalness of each of the 
synthesized utterances; 

calculating a score for each of the synthesized 
utterances using an objective measure, the 
objective measure being a function of 
textual information of speech units for each 
of the utterances;

ascertaining a relationship between the scores of
the objective measure and ratings of the
synthesized utterances;

varying a parameter of the speech synthesizer;

obtaining speech units for another utterance after
the parameter of the speech synthesizer has
been varied;

calculating a second score for said another
utterance using the objective measure; and

using the relationship and the second score to
estimate naturalness of said another
utterance.
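
Illustrative sketch (not part of the claims): the development loop of claim 19 can be pictured as re-scoring an utterance under each parameter setting and mapping the score through an already fitted relationship. All callables below are hypothetical stand-ins; in particular, treating the parameter as a unit-inventory size and the toy cost model are assumptions made only so the sketch runs.

```python
def compare_parameter_settings(settings, text, synthesize_units,
                               objective_score, relationship):
    """Estimate naturalness for one text under each parameter setting."""
    estimates = {}
    for setting in settings:
        units = synthesize_units(text, setting)       # units after varying the parameter
        second_score = objective_score(units)         # objective measure only
        estimates[setting] = relationship(second_score)  # no new listening test
    return estimates


# Toy stand-ins; none of these reflect a real synthesizer.
def toy_synthesize_units(text, inventory_size):
    # Pretend a larger inventory yields a lower cost per selected unit.
    return [1.0 / inventory_size for _ in text.split()]


def toy_objective_score(unit_costs):
    return sum(unit_costs) / len(unit_costs)


def toy_relationship(score):
    # Placeholder for a relationship fitted from rated utterances.
    return 5.0 - 4.0 * score


estimates = compare_parameter_settings(
    [1, 2, 4], "a short test", toy_synthesize_units,
    toy_objective_score, toy_relationship,
)
```

The point of the loop is that only the initial set of utterances needs subjective rating; later parameter changes are compared through the objective measure and the fitted relationship.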

20. The method of claim 19 wherein obtaining 
speech units for another utterance includes obtaining 
speech units for a second set of utterances, wherein 
calculating a second score includes calculating 
corresponding scores for each of the utterances of 
the second set of utterances, and wherein using the 
relationship includes using the relationship to 
estimate naturalness of each of said second set of 
utterances.

21. The method of claim 19 wherein the parameter 
comprises an amount of speech units available for 
synthesis.

22. The method of claim 19 wherein the parameter 
comprises an algorithm for selecting speech units.

23. The method of claim 19 wherein calculating a score
includes generating context vectors for each
synthesized utterance wherein the context vectors
comprise at least two coordinates of textual
information from a set including:

an indication of a position of a speech unit in a
phrase;

an indication of a position of a speech unit in a
word;

an indication of a category for a phoneme
preceding a speech unit;

an indication of a category for a phoneme
following a speech unit;

an indication of a category for the tone of a
preceding speech unit;

an indication of a category for a tone of a
following speech unit; and

an indication of a level of stress of a speech
unit.

24. The method of claim 23 wherein calculating a score 
includes generating context vectors for each of the 
synthesized utterances wherein the context vectors 
comprise at least three coordinates of textual
information from the set. 

25. The method of claim 23 wherein calculating a score 
includes generating context vectors for each of the 
synthesized utterances wherein the context vectors 
comprise at least four coordinates of textual
information from the set.

26. The method of claim 23 wherein calculating a score
includes generating context vectors for each of the
synthesized utterances wherein the context vectors
comprise at least five coordinates of textual
information from the set. 

27. The method of claim 23 wherein calculating a score 
includes generating context vectors for each of the 
synthesized utterances wherein the context vectors 
comprise at least six coordinates of textual
information from the set. 

28. The method of claim 23 wherein the objective 
measure includes an indication of prosodic mismatch of 
successive speech units. 

29. The method of claim 23 wherein the coordinates are 
weighted. 



